Skip to content

[5.5.22][core] Bound JDBC waits on VIP failover#3993

Open
zstack-robot-1 wants to merge 1 commit into
5.5.22from
sync/ye.zou/fix/5.5.22-jdbc-vip-timeouts
Open

[5.5.22][core] Bound JDBC waits on VIP failover#3993
zstack-robot-1 wants to merge 1 commit into
5.5.22from
sync/ye.zou/fix/5.5.22-jdbc-vip-timeouts

Conversation

@zstack-robot-1
Copy link
Copy Markdown
Collaborator

Summary

  • Add conservative default MySQL JDBC parameters for MN datasource URLs: connectTimeout=5000, socketTimeout=60000, tcpKeepAlive=true.
  • Apply the defaults both when datasource URLs are derived from DB.url and when DbFacadeDataSource.jdbcUrl / RESTApiDataSource.jdbcUrl are explicitly configured.
  • Keep operator-provided JDBC parameters intact; existing connectTimeout, socketTimeout, or tcpKeepAlive values are not overwritten.
  • Add a unit test covering DB.url query handling, {database} templates, explicit datasource URLs, and explicit timeout preservation.

Background

In a two-MN deployment where the control plane reaches MySQL through a keepalived VIP, old JDBC connections can survive a VIP move and block in Connector/J socket reads. A captured MN thread dump showed a ZStack API worker blocked in SocketInputStream.socketRead0 through ConnectionImpl.commit, so the fix bounds read/connect waits and allows recovery instead of indefinite blocking.

Verification

  • JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 mvn -pl core -DskipTests install
  • JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 mvn -pl plugin/kvm -DskipTests install
  • JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 mvn -Dtest=org.zstack.test.core.TestDefaultDbJdbcUrl -Dsurefire.useFile=false test
  • Manual HA validation on 172.20.1.164 / 172.20.1.165: restart keepalived to move VIP, 40/40 zstack-cli query iterations succeeded with only a bounded post-failover delay.

Notes

Local Maven initially failed because existing target/ output was owned by root and the default shell used Java 21; fixed by chowning generated targets and running with JDK 8.

sync from gitlab !9886

Add default MySQL JDBC connect/read timeouts and TCP keepalive to management-node datasource URLs so stale VIP connections fail and recover instead of waiting indefinitely in Connector/J socket reads.

Constraint: 5.5.22 deployments may configure datasource URLs explicitly through zstack.properties, so explicit DbFacadeDataSource and RESTApiDataSource URLs must also receive defaults when they do not already set them.
Rejected: Only changing XML fallback URLs | zstack.properties datasource URLs bypass the XML fallback in deployed environments.
Confidence: high
Scope-risk: moderate
Directive: Do not remove the JDBC timeout defaults without re-testing keepalived VIP movement against a running MN.
Tested: JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 mvn -pl core -DskipTests install
Tested: JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 mvn -pl plugin/kvm -DskipTests install
Tested: JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 mvn -Dtest=org.zstack.test.core.TestDefaultDbJdbcUrl -Dsurefire.useFile=false test
Tested: Manual 172.20.1.164/172.20.1.165 keepalived VIP failover, 40/40 zstack-cli queries succeeded with bounded post-failover delay
Not-tested: Full CI suite

Resolves: ZSTAC-0

Change-Id: Ie551c05a6b646c192f01e47fdda902b388cc2b04
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 15, 2026

Review Change Stack

概览

新增数据库连接的全局参数配置(connectTimeout/socketTimeout/tcpKeepAlive),在Platform中实现JDBC URL的统一构建逻辑,自动追加默认连接参数并避免重复,通过完整测试验证多种URL场景下的正确处理。

变更

数据库JDBC URL默认参数自动化

Layer / File(s) 摘要
数据库连接参数全局配置
core/src/main/java/org/zstack/core/db/DatabaseGlobalProperty.java
新增DB.connectTimeout(5000ms)、DB.socketTimeout(60000ms)、DB.tcpKeepAlive(true)三个全局属性字段,作为JDBC URL追加参数的配置来源。
JDBC URL处理与集成
core/src/main/java/org/zstack/core/Platform.java
实现六个私有方法处理JDBC URL:buildDbJdbcUrl支持{database}占位符替换或路径追加;prepareDefaultDbJdbcUrl负责在URL上追加默认参数;appendDefaultDbJdbcParametersappendJdbcParameterIfAbsenthasJdbcParameter等辅助方法实现参数检测和避免重复。修改prepareDefaultDbProperties使用新的URL构建方法并为DbFacadeDataSource和RESTApiDataSource两个数据源应用默认参数。
功能验证测试
test/src/test/java/org/zstack/test/core/TestDefaultDbJdbcUrl.java
新增测试类验证JDBC URL处理的五个测试场景:①默认URL直接追加参数;②现有查询参数下追加到Query末尾;③显式配置的参数不被覆盖;④模板URL中{database}替换后Query位置保留;⑤显式数据源JDBC URL被追加默认参数。通过反射调用Platform私有方法,使用@Before/@after维护全局配置的隔离。

评审工作量估算

🎯 3 (中等) | ⏱️ ~20分钟

庆祝诗

🐰 数据库连接焕新颜,
参数自动来追添,
URL 格式多般样,
超时重复避纷扰,
测试五关皆通过!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed 标题清晰准确地反映了变更的主要目的:为 JDBC 连接添加超时限制以解决 VIP 故障转移时的阻塞问题。
Description check ✅ Passed 描述详细说明了变更内容、背景原因和验证方法,完全相关于代码库中新增的 JDBC 参数、工具方法和单元测试。
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch sync/ye.zou/fix/5.5.22-jdbc-vip-timeouts

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@core/src/main/java/org/zstack/core/db/DatabaseGlobalProperty.java`:
- Around line 25-30: 在 DatabaseGlobalProperty 类中为新加的全局属性
DbConnectTimeout、DbSocketTimeout、DbTcpKeepAlive
增加启动时格式校验:把超时属性(DbConnectTimeout、DbSocketTimeout)从无约束的 String 改为数值类型(int/long 或
Integer/Long)或在类的静态初始化/属性加载路径中立即对其进行 Integer.parseInt/Long.parseLong
校验并在解析失败时抛出明确异常;对 DbTcpKeepAlive 进行布尔解析(Boolean.parseBoolean 或更严格的接受
"true"/"false" 校验)并在格式不符合时抛出错误。确保改动出现在 DatabaseGlobalProperty 的定义/初始化逻辑中(涉及字段
DbConnectTimeout、DbSocketTimeout、DbTcpKeepAlive),以便在应用启动阶段就能发现并报出配置错误。

In `@core/src/main/java/org/zstack/core/Platform.java`:
- Line 390: The debug log currently prints the full JDBC URL (defaultedJdbcUrl)
which may contain sensitive info; update the logging in class Platform where
propertyName and defaultedJdbcUrl are used so it does not emit the raw
URL—either log only the propertyName or sanitize/mask credentials and query
parameters from defaultedJdbcUrl before including it in the logger.debug call
(use a helper to strip user:pass@ and sensitive query keys), ensuring the log
message retains context but never contains unmasked connection secrets.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)

Review profile: CHILL

Plan: Pro

Run ID: 80456f43-32d3-4ba7-bd03-7ba560c91ae3

📥 Commits

Reviewing files that changed from the base of the PR and between 39c155a and 33fcf90.

⛔ Files ignored due to path filters (2)
  • conf/springConfigXml/DatabaseFacade.xml is excluded by !**/*.xml
  • conf/springConfigXml/RESTFacade.xml is excluded by !**/*.xml
📒 Files selected for processing (3)
  • core/src/main/java/org/zstack/core/Platform.java
  • core/src/main/java/org/zstack/core/db/DatabaseGlobalProperty.java
  • test/src/test/java/org/zstack/test/core/TestDefaultDbJdbcUrl.java

Comment on lines +25 to +30
@GlobalProperty(name="DB.connectTimeout", defaultValue = "5000")
public static String DbConnectTimeout;
@GlobalProperty(name="DB.socketTimeout", defaultValue = "60000")
public static String DbSocketTimeout;
@GlobalProperty(name="DB.tcpKeepAlive", defaultValue = "true")
public static String DbTcpKeepAlive;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

为新增 JDBC 参数增加格式校验,避免错误配置延迟到运行期暴露。

Line 25-30 的新属性当前无格式约束。建议在属性层直接限制:超时必须为数字,tcpKeepAlive 必须为布尔值,这样可在启动阶段尽早失败并给出明确错误。

建议修改
-    `@GlobalProperty`(name="DB.connectTimeout", defaultValue = "5000")
+    `@GlobalProperty`(name="DB.connectTimeout", defaultValue = "5000")
+    `@RegexValues`(value = "^[0-9]+$")
     public static String DbConnectTimeout;
-    `@GlobalProperty`(name="DB.socketTimeout", defaultValue = "60000")
+    `@GlobalProperty`(name="DB.socketTimeout", defaultValue = "60000")
+    `@RegexValues`(value = "^[0-9]+$")
     public static String DbSocketTimeout;
-    `@GlobalProperty`(name="DB.tcpKeepAlive", defaultValue = "true")
+    `@GlobalProperty`(name="DB.tcpKeepAlive", defaultValue = "true")
+    `@RegexValues`(value = "^(?i:true|false)$")
     public static String DbTcpKeepAlive;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/src/main/java/org/zstack/core/db/DatabaseGlobalProperty.java` around
lines 25 - 30, 在 DatabaseGlobalProperty 类中为新加的全局属性
DbConnectTimeout、DbSocketTimeout、DbTcpKeepAlive
增加启动时格式校验:把超时属性(DbConnectTimeout、DbSocketTimeout)从无约束的 String 改为数值类型(int/long 或
Integer/Long)或在类的静态初始化/属性加载路径中立即对其进行 Integer.parseInt/Long.parseLong
校验并在解析失败时抛出明确异常;对 DbTcpKeepAlive 进行布尔解析(Boolean.parseBoolean 或更严格的接受
"true"/"false" 校验)并在格式不符合时抛出错误。确保改动出现在 DatabaseGlobalProperty 的定义/初始化逻辑中(涉及字段
DbConnectTimeout、DbSocketTimeout、DbTcpKeepAlive),以便在应用启动阶段就能发现并报出配置错误。

String defaultedJdbcUrl = appendDefaultDbJdbcParameters(jdbcUrl);
if (!jdbcUrl.equals(defaultedJdbcUrl)) {
System.setProperty(propertyName, defaultedJdbcUrl);
logger.debug(String.format("append default JDBC parameters to %s [%s]", propertyName, defaultedJdbcUrl));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

避免在日志中输出完整 JDBC URL(可能包含敏感参数)。

Line 390 直接打印 defaultedJdbcUrl,若 URL 携带凭据或其他敏感 query,会写入日志。建议只记录属性名,或先脱敏再打印。

建议修改
-            logger.debug(String.format("append default JDBC parameters to %s [%s]", propertyName, defaultedJdbcUrl));
+            logger.debug(String.format("append default JDBC parameters to %s", propertyName));
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/src/main/java/org/zstack/core/Platform.java` at line 390, The debug log
currently prints the full JDBC URL (defaultedJdbcUrl) which may contain
sensitive info; update the logging in class Platform where propertyName and
defaultedJdbcUrl are used so it does not emit the raw URL—either log only the
propertyName or sanitize/mask credentials and query parameters from
defaultedJdbcUrl before including it in the logger.debug call (use a helper to
strip user:pass@ and sensitive query keys), ensuring the log message retains
context but never contains unmasked connection secrets.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants